Regularized Max Pooling for Image Categorization

Author

Minh Hoai

Abstract

We propose Regularized Max Pooling (RMP) for image classification. RMP classifies an image (or image region) by extracting feature vectors at multiple subwindows at multiple locations and scales. Unlike Spatial Pyramid Matching, where the subwindows are defined purely by geometric correspondence, RMP accounts for the deformation of discriminative parts. The amount of deformation and the discriminative ability of multiple parts are jointly learned during training.

An RMP model is a collection of filters. Each filter is anchored to a specific image subwindow and associated with a set of deformation coefficients. The anchoring subwindows are predetermined at various locations and scales, while the filters and deformation coefficients are learnable parameters of the model. Fig. 1 shows a possible way to define subwindows. To classify a test image, RMP extracts feature vectors for all anchoring subwindows. The classification score of an image is the weighted sum of all filter responses. Each filter yields a set of filter responses, one for each level of deformation. The deformation coefficients are the weights for these filter responses.

Given a set of images $\{I_i\}_{i=1}^{n}$ and labels $\{y_i \mid y_i \in \{1, -1\}\}_{i=1}^{n}$, consider a particular set of geometrically defined subwindows that can encode the semantic content of an image at different locations and scales (e.g., Fig. 1). Let $\{I^j\}_{j=1}^{m}$ denote the set of subwindows for image $I$. Let $\phi$ be the feature function whose input is an image region and whose output is a column vector. Let $D^j$ be the feature matrix computed at location $j$ for all images and $K^j$ the corresponding kernel, i.e., $D^j = [\phi(I_1^j) \cdots \phi(I_n^j)]$ and $K^j = (D^j)^T D^j$. The joint kernel for all subwindows is the sum of all kernels: $K = \sum_{j=1}^{m} K^j$; this corresponds to concatenating all feature vectors computed at all subwindows. Given the kernel $K$, we train a Least-Squares SVM and obtain a coefficient vector $\alpha$ and bias term $b$.
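
The kernel construction and training step described above can be illustrated with a minimal NumPy sketch. It is not the author's implementation: the function names, the random stand-in features, and the regularization weight `lam` are assumptions made for illustration, and the Least-Squares SVM is written in its common linear-system form, which may differ in detail from the formulation used in the paper.

```python
import numpy as np

def subwindow_kernels(features):
    """Linear kernel K^j = (D^j)^T D^j for each subwindow j.

    `features` is a list of m arrays; features[j] plays the role of D^j,
    with shape (d_j, n) and one column per training image.
    """
    return [D.T @ D for D in features]

def train_ls_svm(K, y, lam=1.0):
    """Solve a standard LS-SVM linear system for (alpha, b).

    Assumed formulation (hypothetical choice of `lam`):
        (K + lam * I) alpha + b * 1 = y,   1^T alpha = 0
    """
    n = K.shape[0]
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0                      # constraint row: sum(alpha) = 0
    A[1:, 0] = 1.0                      # bias column
    A[1:, 1:] = K + lam * np.eye(n)     # regularized kernel block
    rhs = np.concatenate(([0.0], y))
    sol = np.linalg.solve(A, rhs)
    return sol[1:], sol[0]

# Toy data: n images, m subwindows, d-dimensional features (random stand-ins).
rng = np.random.default_rng(0)
n, m, d = 8, 3, 5
features = [rng.standard_normal((d, n)) for _ in range(m)]
y = np.array([1, -1, 1, -1, 1, 1, -1, -1], dtype=float)

# Joint kernel: the sum of the per-subwindow kernels.
K = sum(subwindow_kernels(features))

# Same matrix as the linear kernel on concatenated feature vectors.
D_concat = np.vstack(features)
assert np.allclose(K, D_concat.T @ D_concat)

alpha, b = train_ls_svm(K, y, lam=1.0)
```

The assertion checks the equivalence stated in the text: summing the per-subwindow kernels gives the same Gram matrix as stacking all subwindow feature vectors into one long vector per image.
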
The filter for subwindow $j$ can be computed as $w^j = D^j \alpha$. For a particular subwindow $j$ and an image $I$, the regularized maximum score is defined:


Similar resources

Minh Hoai: Regularized Max Pooling for Image Categorization

We propose Regularized Max Pooling (RMP) for image classification. RMP classifies an image (or an image region) by extracting feature vectors at multiple subwindows at multiple locations and scales. Unlike Spatial Pyramid Matching where the subwindows are defined purely based on geometric correspondence, RMP accounts for the deformation of discriminative parts. The amount of deformation and the...


Emergence of Selective Invariance in Hierarchical Feed Forward Networks

Many theories have emerged which investigate how invariance is generated in hierarchical networks through simple schemes such as max and mean pooling. The restriction to max/mean pooling in theoretical and empirical studies has diverted attention away from a more general way of generating invariance to nuisance transformations. In this exploratory study, we study the conjecture that hierarchica...


Joint Dictionary and Classifier Learning for Categorization of Images Using a Max-margin Framework

The Bag-of-Visual-Words (BoVW) model is a popular approach for visual recognition. Used successfully in many different tasks, simplicity and good performance are the main reasons for its popularity. The central aspect of this model, the visual dictionary, is used to build mid-level representations based on low level image descriptors. Classifiers are then trained using these mid-level represent...


Multiple spatial pooling for visual object recognition

Global spatial structure is an important factor for visual object recognition but has not attracted sufficient attention in recent studies. Especially, the problems of features' ambiguity and sensitivity to location change in the image space are not yet well solved. In this paper, we propose multiple spatial pooling (MSP) to address these problems. MSP models global spatial structure with multi...


Efficient Multiclass Implementations of L1-Regularized Maximum Entropy

This paper discusses the application of L1-regularized maximum entropy modeling or SL1-Max [9] to multiclass categorization problems. A new modification to the SL1-Max fast sequential learning algorithm is proposed to handle conditional distributions. Furthermore, unlike most previous studies, the present research goes beyond a single type of conditional distribution. It describes and compares ...




Publication date: 2014
